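For Mandarin end-to-end (E2E) automatic speech recognition (ASR), pronunciation-based modeling units share parameters across modeling units better during training than character-based units do, but they suffer from homophones. In this study, we propose a novel pronunciation-aware unique character encoding for building E2E RNN-T-based Mandarin ASR systems. The proposed encoding combines a pronunciation-based syllable with a character index (CI). By introducing the CI, the RNN-T model can overcome the homophone problem while still using pronunciation information to derive its modeling units. With the proposed encoding, the model outputs can be converted into the final recognition result through a one-to-one mapping. We conducted experiments on the Aishell and MagicData datasets, and the results demonstrate the effectiveness of the proposed method.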
translated by Google Translate
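The syllable-plus-character-index encoding described in the abstract above can be illustrated with a minimal sketch. The toy lexicon, the function names, and the choice of toneless pinyin are assumptions made for illustration only; the abstract does not specify how syllables or indices are assigned, and the sketch assumes each character has a single pronunciation.

```python
from collections import defaultdict

# Hypothetical toy lexicon: character -> toneless pinyin syllable.
# A real system would derive this from a pronunciation dictionary.
LEXICON = {"是": "shi", "市": "shi", "事": "shi", "马": "ma", "妈": "ma", "好": "hao"}


def build_codebook(lexicon):
    """Give every character a unique (syllable, character index) code."""
    by_syllable = defaultdict(list)
    for char, syllable in lexicon.items():
        by_syllable[syllable].append(char)

    char_to_code = {}
    for syllable, chars in by_syllable.items():
        # Homophones share the syllable; the index tells them apart.
        for index, char in enumerate(chars):
            char_to_code[char] = (syllable, index)

    # The mapping is one-to-one, so it can be inverted directly.
    code_to_char = {code: char for char, code in char_to_code.items()}
    return char_to_code, code_to_char


def encode(text, char_to_code):
    """Map a character string to a list of (syllable, index) codes."""
    return [char_to_code[ch] for ch in text]


def decode(codes, code_to_char):
    """Map a list of (syllable, index) codes back to characters."""
    return "".join(code_to_char[code] for code in codes)


CHAR_TO_CODE, CODE_TO_CHAR = build_codebook(LEXICON)
```

Because every code corresponds to exactly one character, decoding is a dictionary lookup and needs no language model to resolve homophones.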
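Acoustic and linguistic features are important cues for the spoken language identification (LID) task. Recent advanced LID systems mainly use acoustic features that lack an explicit encoding of linguistic features. In this paper, we propose a novel transducer-based language embedding approach for LID by integrating an RNN transducer model into a language embedding framework. Benefiting from the linguistic representation capability of the RNN transducer, the proposed method can exploit both phonetically aware acoustic features and explicit linguistic features for LID. Experiments were carried out on the large-scale Multilingual LibriSpeech and VoxLingua107 datasets. The results show that the proposed method significantly improves LID performance, with relative improvements of 12% to 59% on in-domain datasets and 16% to 24% on cross-domain datasets.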
This study demonstrates the feasibility of point cloud-based proactive link quality prediction for millimeter-wave (mmWave) communications. Image-based methods have been proposed that use machine learning to quantitatively and deterministically predict future received signal strength from time series of depth images, in order to mitigate human-body line-of-sight (LOS) path blockage in mmWave communications. However, the environments in which image-based methods can be applied are limited because camera images may contain private information. Thus, this study demonstrates the feasibility of using point clouds obtained from light detection and ranging (LiDAR) for mmWave link quality prediction. Point clouds represent three-dimensional (3D) spaces as a set of points; they are sparser than camera images and less likely to contain sensitive information. Additionally, point clouds provide 3D position and motion information, which is necessary for understanding a radio propagation environment that involves pedestrians. This study designs an mmWave link quality prediction method and conducts two experimental evaluations using different types of point clouds, obtained from LiDAR and from depth cameras, and different numerical indicators of link quality, namely received signal strength and throughput. In these experiments, the proposed method predicts future large attenuation of mmWave link quality caused by LOS blockage by human bodies; our point cloud-based method can therefore be an alternative to image-based methods.
Factorization machines (FMs) are a powerful tool for regression and classification on sparse observations and have been successfully applied to collaborative filtering, especially when side information about users or items is available. Bayesian formulations of FMs have been proposed to provide confidence intervals over the model's predictions; however, they usually rely on Markov chain Monte Carlo methods that require many samples to give accurate predictions, which makes training slow on large-scale data. In this paper, we propose a variational formulation of factorization machines that yields a simple objective that can be optimized with standard mini-batch stochastic gradient descent, making it amenable to large-scale data. Our algorithm learns an approximate posterior distribution over the user and item parameters, which leads to confidence intervals over the predictions. Using several datasets, we show that it performs comparably to or better than existing methods in terms of prediction accuracy, and we present applications to active learning strategies, e.g., preference elicitation techniques.
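For readers unfamiliar with factorization machines, the sketch below computes the standard second-order FM predictor, w0 + sum_i w_i x_i + sum_{i<j} <V_i, V_j> x_i x_j, using the usual linear-time rewriting of the pairwise term. The abstract does not state the predictor or the variational objective, so this shows only the common point-estimate form; the function name and the example numbers are assumptions.

```python
def fm_predict(x, w0, w, V):
    """Second-order factorization machine prediction for one feature vector.

    x:  list of n feature values
    w0: global bias
    w:  list of n linear weights
    V:  n rows of k latent factors (one row per feature)
    """
    n = len(x)
    k = len(V[0])
    linear = sum(w[i] * x[i] for i in range(n))

    # Pairwise term via the identity
    #   sum_{i<j} <V_i, V_j> x_i x_j
    #     = 0.5 * sum_f [ (sum_i V_if x_i)^2 - sum_i (V_if x_i)^2 ]
    pairwise = 0.0
    for f in range(k):
        total = sum(V[i][f] * x[i] for i in range(n))
        total_of_squares = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (total * total - total_of_squares)

    return w0 + linear + pairwise


# Hypothetical example: features 0 and 1 are active, feature 2 is not.
EXAMPLE_PREDICTION = fm_predict(
    x=[1.0, 1.0, 0.0],
    w0=0.5,
    w=[0.1, 0.2, 0.3],
    V=[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
)
```

In a variational treatment, the weights and factors would be distributions rather than fixed numbers, and a predictive interval could be obtained by evaluating this function on several parameter samples.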
Offline reinforcement learning (RL) has received rising interest due to its appealing data efficiency. The present study addresses behavior estimation, a task that lays the foundation of many offline RL algorithms. Behavior estimation aims to estimate the policy with which the training data were generated. In particular, this work considers a scenario where the data are collected from multiple sources. In this case, existing approaches to behavior estimation neglect data heterogeneity and therefore suffer from behavior misspecification. To overcome this drawback, the present study proposes a latent variable model that infers a set of policies from data, which allows an agent to use as its behavior policy the policy that best describes a particular trajectory. This model provides the agent with a fine-grained characterization of multi-source data and helps it overcome behavior misspecification. This work also proposes a learning algorithm for this model and illustrates its practical usage by extending an existing offline RL algorithm. Lastly, through extensive evaluation, this work confirms the existence of behavior misspecification and the efficacy of the proposed model.
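The idea of using "the policy that best describes a particular trajectory" can be sketched as follows, under the assumption that "best describes" means highest log-likelihood of the observed actions. The tabular policy format, the function names, and the example policies are hypothetical; the abstract does not describe the latent variable model or its learning algorithm, and neither is shown here.

```python
import math


def trajectory_log_likelihood(policy, trajectory):
    """Sum of log pi(action | state) over the (state, action) pairs."""
    return sum(math.log(policy[state][action]) for state, action in trajectory)


def best_policy_index(policies, trajectory):
    """Index of the candidate policy under which the trajectory is most likely."""
    scores = [trajectory_log_likelihood(p, trajectory) for p in policies]
    return max(range(len(policies)), key=scores.__getitem__)


# Two hypothetical data sources with different habits in state "s0".
POLICIES = [
    {"s0": {"left": 0.9, "right": 0.1}},
    {"s0": {"left": 0.2, "right": 0.8}},
]

MOSTLY_LEFT = [("s0", "left"), ("s0", "left"), ("s0", "right")]
MOSTLY_RIGHT = [("s0", "right"), ("s0", "right"), ("s0", "left")]
```

A single policy fitted to the pooled data would put roughly equal mass on both actions and describe neither source well, which is one way to picture the misspecification the abstract refers to.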
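Traditional neural structures tend to communicate through analog quantities such as currents or voltages. However, as CMOS devices shrink and supply voltages decrease, the dynamic range of voltage/current-domain analog circuits becomes narrower, the available margin becomes smaller, and noise immunity decreases. Moreover, the use of operational amplifiers (op-amps) and clocked or asynchronous comparators in conventional designs leads to high energy consumption and large chip area, which is detrimental to building spiking neural networks. In view of this, we propose a neural structure for generating and transmitting time-domain signals, comprising a neuron module, a synapse module, and two weight modules. The proposed neural structure is driven by leakage current in the triode region of the transistor and uses neither operational amplifiers nor comparators, so it provides higher energy and area efficiency than conventional designs. In addition, because internal communication uses time-domain signals, the structure offers greater noise immunity, which simplifies the wiring between modules. The proposed neural structure was fabricated using TSMC 65 nm CMOS technology. The proposed neuron and synapse occupy areas of 127 µm² and 231 µm², respectively, while achieving millisecond time constants. Measurements on the fabricated chip show that the proposed structure successfully implements temporal signal communication with millisecond time constants, which is a critical step toward hardware reservoir computing for human-computer interaction.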
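The research process involves many decisions, such as how to title a paper and where to publish it. In this paper, we introduce a general framework for investigating the effects of such decisions. The main difficulty in studying these effects is that we need to know the counterfactual outcomes, which are not available in reality. The key insight of our framework is inspired by existing counterfactual analyses in which researchers regard twins as counterfactual units. The proposed framework regards a pair of papers that cite each other as twins. Such papers tend to be parallel works on similar topics and in similar communities. We investigate twin papers that made different decisions, observe how the research impact of these papers progresses, and estimate the effect of a decision from the difference in their impacts. We release our code and data, which we believe are highly beneficial given the lack of datasets for counterfactual studies.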
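The twin definition above, a pair of papers that cite each other, amounts to finding mutual edges in a citation graph. The following is a minimal sketch; the dictionary-of-sets graph format, the function name, and the paper identifiers are assumptions, not the format of the released code or data.

```python
def find_twin_pairs(citations):
    """Return sorted (a, b) pairs, with a < b, of papers that cite each other.

    citations maps a paper id to the set of paper ids it cites.
    """
    twins = set()
    for paper, cited in citations.items():
        for other in cited:
            # A twin pair needs an edge in both directions.
            if other != paper and paper in citations.get(other, ()):
                twins.add(tuple(sorted((paper, other))))
    return sorted(twins)


# Hypothetical citation graph: A and B cite each other, D only cites C.
EXAMPLE_CITATIONS = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": set(),
    "D": {"C"},
}
EXAMPLE_TWINS = find_twin_pairs(EXAMPLE_CITATIONS)
```

Estimating the effect of a decision would then compare an impact measure within each pair whose two papers made different choices; that step depends on details the abstract does not give.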
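Discovering the features associated with heterogeneity in treatment effects is essential for revealing the underlying causal mechanisms. Existing methods seek such features by measuring how strongly the feature attributes affect the conditional average treatment effect (CATE). However, these methods may overlook important features because CATE, being a measure of the average treatment effect, cannot detect differences in distribution parameters other than the mean (e.g., the variance). To address this weakness of existing methods, we propose a feature selection framework for discovering distributional treatment effect modifiers. We first formulate a feature importance measure that quantifies how strongly the feature attributes influence the discrepancy between the potential outcome distributions. We then derive a computationally efficient estimator of this measure and develop a feature selection algorithm that can control the type I error rate at a desired level. Experimental results show that our framework successfully discovers important features and outperforms the existing mean-based method.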
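The claim that a mean-based measure can miss a distributional difference is easy to see on toy numbers. In the sketch below, the outcome values are invented and the variance gap is only a stand-in for a distribution discrepancy; it is not the feature importance measure proposed in the paper, which the abstract does not define.

```python
from statistics import mean, pvariance


def mean_effect(treated, control):
    """Difference in average outcome (what a mean-based measure sees)."""
    return mean(treated) - mean(control)


def variance_gap(treated, control):
    """Difference in outcome variance (one distributional property among many)."""
    return pvariance(treated) - pvariance(control)


# Hypothetical outcomes within one subgroup: the treatment leaves the
# average unchanged but spreads the outcomes out.
TREATED = [-2.0, -1.0, 0.0, 1.0, 2.0]
CONTROL = [-0.5, -0.25, 0.0, 0.25, 0.5]

MEAN_EFFECT = mean_effect(TREATED, CONTROL)    # zero: the means coincide
VARIANCE_GAP = variance_gap(TREATED, CONTROL)  # positive: the spreads differ
```

A feature that separates subgroups like this one from subgroups with no change in spread would look irrelevant to a mean-based criterion even though it modifies the outcome distribution.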
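A shortcoming of batch reinforcement learning is that it requires rewards in the data, which makes it inapplicable to tasks without a reward function. Existing settings that lack rewards, such as behavioral cloning, rely on optimal demonstrations collected from humans. Unfortunately, ensuring optimality requires extensive expertise, which hinders the collection of large-scale data for complex tasks. This paper addresses the lack of rewards in the batch reinforcement learning setting by learning a reward function from preferences. Generating preferences requires only a basic understanding of the task, and, being a mental process, it is faster than performing demonstrations. Preferences can therefore be collected at scale from non-expert humans through crowdsourcing. This paper tackles a critical challenge that arises when collecting data from non-expert humans: noise in the preferences. It proposes a novel probabilistic model of label reliability that uses the labels collaboratively. Moreover, the proposed model smooths this estimate with the learned reward function. Evaluation on Atari datasets demonstrates the effectiveness of the proposed model, and an ablation study analyzes the relative importance of the proposed ideas.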
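The word mover's distance (WMD) is a fundamental technique for measuring the similarity of two documents. The crux of WMD is that it can exploit the underlying geometry of the word embedding space by employing an optimal transport formulation. The original study on WMD reported that WMD outperforms classical baselines such as bag-of-words (BOW) and TF-IDF by significant margins on various datasets. In this paper, we point out that the evaluation in the original study could be misleading. We re-evaluate the performance of WMD and the classical baselines and find that the classical baselines are competitive with WMD if we apply appropriate preprocessing, namely L1 normalization. In addition, we introduce an analogy between WMD and L1-normalized BOW and find that not only the performance of WMD but also its distance values resemble those of BOW in high-dimensional spaces.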
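The L1-normalized BOW baseline mentioned above can be sketched in a few lines. Pairing it with the L1 distance is an assumption made here for illustration, as are the function names and the toy documents; the abstract only states that L1 normalization is the preprocessing step that matters.

```python
from collections import Counter


def l1_normalized_bow(tokens):
    """Bag-of-words vector scaled so that its entries sum to 1."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def l1_distance(p, q):
    """Sum of absolute differences over the union of both vocabularies."""
    return sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in set(p) | set(q))


# Hypothetical documents, already tokenized.
DOC_A = ["cat", "sat"]
DOC_B = ["cat", "ran"]

BOW_A = l1_normalized_bow(DOC_A)
BOW_B = l1_normalized_bow(DOC_B)
DISTANCE_AB = l1_distance(BOW_A, BOW_B)
```

After normalization, each document is a probability distribution over words, which is also the form WMD takes as input, and repeating a document any number of times leaves its vector unchanged, so document length no longer dominates the distance.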